Extending the Rule Space Model

Extending the Rule Space Model to a Semantically-Rich Domain:

Diagnostic  Assessment  in  Architecture 

Abstract 

This paper presents a technique for applying the Rule Space model of cognitive diagnosis
(Tatsuoka, 1983) to assessment in a semantically-rich domain. Responses to 22 architecture test
items, developed to assess a range of architectural knowledge, were analyzed using Rule Space.
Verbal protocol analyses guided the construction of a model of examinee performance, consisting
of processes for constructing an initial representation of an item (labeled understand), forming
goals and performing actions based on those goals (solve), and determining whether goals have
been attempted and satisfied (check). Item attributes, derived from these processes, formed the
basis of diagnosis. Our technique extends Rule Space’s applicability by defining attributes in
terms of item characteristics and the causal relations between characteristics and the
problem-solving model.

Data were collected from 122 architects of various ability levels (students, architecture
interns, and professional architects). Rule Space successfully classified approximately 65%, 90%,
and 40% of examinees based, respectively, on attributes associated with the understand, solve, and
check processes of the problem-solving model. The findings support the effectiveness of Rule
Space in a complex domain and suggest directions for developing new architecture items by using
attributes particularly effective at distinguishing among examinees of different ability levels.


Index terms: diagnostic assessment; problem solving; architecture; rule space; item attributes;
computer-based testing

Extending the Rule Space Model to a Complex Domain:

Diagnostic  Assessment  in  Architecture 

As testing programs begin to employ new forms of assessment, a common goal is to
construct tests whose demands are closely related to tasks in the target domain (Wiggins, 1989).
While recent research has presented several types of assessment tasks (e.g., simulation) that more
accurately capture relevant knowledge and skills, there remains the issue of performance reporting:
How can we provide examinees with information beyond scores of overall proficiency,
information that captures the richness of knowledge and skills in a domain? In the current work,
we employ the Rule Space Model (Tatsuoka, 1983) to generate descriptions of examinee ability
that are far richer than those normally derived from large-scale assessment. However, Rule Space
has been most successfully applied in the past only to relatively narrow topics in well-defined
domains (e.g., mixed number subtraction, single-variable isolation in algebra). This paper
presents a technique for applying the Rule Space model of cognitive diagnosis (Tatsuoka, 1983) to
a semantically-rich domain in need of more authentic, yet tractable, assessments: architecture.

Architecture  Assessment 

Current architecture assessments consist primarily of short, verbal multiple-choice
questions or complex items that mimic the tasks architects normally encounter in the workplace.
Because architecture is a complex domain, individuals’ scores on relatively simple, verbal
multiple-choice tests do not capture the complexity of the knowledge and skills to be assessed.
We address these issues by presenting examinees with figural response test items (Martinez, 1991;
in press) and by generating diagnostic profiles of examinees based on their performance using the
Rule Space model (Tatsuoka, 1983).

The figural response items used in this study differ from standard multiple-choice items in
that examinees must construct their answers and the responses consist of the generation or
manipulation of figural material (e.g., graphs, pictures). Figural response items are especially
suited to domains that are graphical or pictorial in nature; the domain of architecture is a natural
candidate for this form of assessment. The approach of using figural response items for
architecture assessment has a number of advantages. First, architecture is a graphical domain;
designs are drawn, rather than essays being written. Thus, the figural response format provides a
natural way for architects to express their ability. Second, constructed response items may be able
to tap skills otherwise inaccessible using the multiple-choice format. Martinez & Katz (1992)
showed, for example, that different skills are frequently tapped by figural response items compared
with their multiple-choice counterparts.

In this study, the figural response items were computer delivered; a sample item is shown
in Figure 1. Each item consists of a stem (top of screen), a diagram, and a set of tools for drawing
on or manipulating the diagram. The item in Figure 1 requires examinees to move the structures at
the bottom of the screen (library, parking lot, and playground) on to the provided site, subject to
the explicit constraints stated in the item stem as well as to the implicit constraints that architects
associate with libraries, parking lots, and playgrounds (e.g., a playground should not be adjacent
to a parking lot; a parking lot must have street access).


Insert Figure 1 about here


Architecture brings certain challenges to the practice of large-scale assessment. First, much
of architectural practice requires design, a notoriously complex cognitive skill. The duration of
design projects in architecture is typically measured in days or months, not minutes as with the
usual examination item. Also, design tasks do not typically have “right” or “wrong” answers.
Rather, a continuum of designs satisfy the constraints of the task to a greater or lesser extent.
Further, in the real world, constraints on a design task are not immutable; often the architect may
relax certain initially specified constraints if he or she believes doing so would allow for a better
design (Goel & Pirolli, 1991). We do not seek to assess design skills directly. Although some of the
figural response items present simple design tasks, most were meant to assess architectural
knowledge through subsidiary tasks. For example, two items present a diagram of a building and
ask the candidate to specify locations of seismic joints. While a corresponding task set for an
architect might not be this simple, the task could come up as part of a larger design task in the real
world.


Architecture may be classified as a “semantically rich domain” (Simon, 1984) in that skilled
performance involves extensive specialized knowledge. Architecture knowledge is usually gained
over several years of intense study. This knowledge comes from a variety of disciplines, including
civil engineering, physics, history, psychology, construction, and art. This forms a second
challenge for architectural assessment. Optimally, assessment will produce similarly rich
descriptions of proficiency based on test performance. In the current work, we employ the Rule
Space Model (Tatsuoka, 1983) to generate descriptions of examinee ability that are far richer than
those normally derived from large-scale assessment.

Our approach, like that of many emerging test theories, blends traditional psychometric
approaches with developments in cognitive psychology (Gitomer & Yamamoto, 1991). Some new
approaches including Rule Space build on item response theory (IRT), in which individuals and
items are ordered along a proficiency continuum (Lord & Novick, 1959). One well-known
shortcoming of IRT is that identical estimates of overall proficiency may be derived from radically
different response patterns. If information about response patterns could be simplified and
preserved, these rich descriptions of performance could be truly diagnostic (Mislevy, 1993).

The  Rule  Space  Model 

The Rule Space model provides descriptions of examinee performance that extend beyond
raw scores or uni-dimensional IRT estimates of overall proficiency. Items are decomposed into
attributes, which represent the latent traits that the items assess. Based on an examinee’s pattern of
correct and incorrect responses, the Rule Space model infers the most likely combination of
attributes the examinee has mastered.

The diagnosis of cognitive errors made by examinees is a pattern classification problem. In
this study, the patterns are item response vectors, and the vectors are ones and zeroes indicating
correct and incorrect responses, respectively. The response vectors are classified as various
correct latent knowledge states. The Rule Space model, developed to solve this classification
problem, has three steps: (1) determination of classification groups, (2) formulation of a
classification space, and (3) classification of examinees’ responses.

Determination of Classification Groups

We assume that each postulated cognitive attribute (declarative knowledge, cognitive
processes, solution strategies, and so forth) is tapped by at least one item in the pool. The
relationship between these cognitive attributes and the items is expressed by an incidence matrix Q,
whose order is the number of cognitive attributes k by the number of items n. If item j involves
attribute k, then $Q_{kj} = 1$; otherwise $Q_{kj} = 0$. Each item is therefore characterized by the
cognitive attributes required for its solution.

For example, suppose there are three items whose two underlying attributes are denoted A1
and A2. Further, suppose A1 is needed to solve items 1 and 3, and A2 is required in item 2.

Then, the incidence matrix Q (2 x 3) is:

                   Items
                1    2    3
Attribute A1    1    0    1
Attribute A2    0    1    0

With three items, there are eight possible response vectors:

(0,0,0),  (1,0,0),  (0,1,0),  (0,0,1),  (1,1,0),  (1,0,1),  (0,1,1),  (1,1,1). 

Given  two  attributes,  there  are  four  possible  examinee  knowledge  states: 

State 1. Examinee cannot do A1, but can do A2
State 2. Examinee cannot do A2, but can do A1
State 3. Examinee can do neither A1 nor A2
State 4. Examinee can do A1 and A2

There are four ideal response vectors conforming to the four states:

State  1.  (0,1,0) 

State  2.  (1,0,1) 

State  3.  (0,0,0) 

State  4.  (1,1,1) 

Note that each ideal response vector corresponds to a unique vector of mastered attributes.
The remaining possible response vectors, (1,0,0), (0,0,1), (1,1,0), and (0,1,1), do not conform
precisely to any of the models. The section entitled Classification of Examinees’ Responses
discusses Rule Space’s treatment of such “non-ideal” response vectors.
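
The enumeration in this example can be reproduced with a short sketch. It is a brute-force illustration only, not the published algorithm for this step, and it assumes that an item is answered correctly only when every attribute it requires has been mastered. The function names are hypothetical.

```python
from itertools import product

# Incidence matrix Q from the example: rows are attributes, columns are items.
Q = [
    [1, 0, 1],  # A1 is needed for items 1 and 3
    [0, 1, 0],  # A2 is needed for item 2
]


def ideal_response(mastery, q):
    """Return the slip-free response vector for one attribute mastery pattern.

    Assumption: an item is correct only if all attributes it requires are mastered.
    """
    n_attributes = len(q)
    n_items = len(q[0])
    return tuple(
        int(all(mastery[k] or not q[k][j] for k in range(n_attributes)))
        for j in range(n_items)
    )


def ideal_patterns(q):
    """Map each distinct ideal response vector to the mastery patterns that produce it."""
    states = {}
    for mastery in product((0, 1), repeat=len(q)):
        states.setdefault(ideal_response(mastery, q), []).append(mastery)
    return states


patterns = ideal_patterns(Q)
for response, masteries in sorted(patterns.items()):
    print(response, "<-", masteries)
```

Because the loop visits every one of the 2^k mastery patterns, this approach is practical only for a handful of attributes.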

Tatsuoka (1991) and Varandi & Tatsuoka (1990) developed an algorithm to produce all
possible ideal response patterns, corresponding to all possible latent knowledge states, from an
incidence matrix Q. The number of states is determined from the number of attributes, the number
of items, and the degree of attribute nesting. In applying Rule Space to other data sets, the number
of latent states has often exceeded 1000.

The Classification Space

In order to preserve continuity with current psychometric theories, the classification space
was formulated as a two-dimensional Cartesian product space of the IRT proficiency parameter
$\theta$ and $\zeta$, an index of the unusualness of an item response pattern, where “unusualness”
refers to the degree to which easier items are answered incorrectly and difficult items are answered
correctly (Tatsuoka & Linn, 1981; Tatsuoka, 1984; 1985; 19^; Tatsuoka & Tatsuoka, 1987). When an
examinee’s response vector conforms well to the average performances on the test items, the
absolute value of $\zeta$ will be nearly zero. When $\zeta$-values of a knowledge state are close to
zero, that is, close to the $\theta$-axis, we can expect that many examinees will be diagnosed to have
that knowledge state. If the $\zeta$-value associated with a knowledge state is large, positively or
negatively, then we expect that state to be unusual in the sense that few examinees will be
diagnosed as having that knowledge state.
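
The text describes the unusualness index only qualitatively. The sketch below shows one standardized index of this kind, under stated assumptions: a two-parameter logistic item model, known item parameters, and an index formed from the products of response residuals and the deviations of item probabilities from their mean. The formula and all names are assumptions for illustration; the cited sources should be consulted for the exact definition used in the analyses.

```python
import math


def p_2pl(theta, a, b):
    """Two-parameter logistic probability of a correct response."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def zeta(responses, theta, items):
    """Standardized unusualness index for one 0/1 response vector.

    items is a list of (a, b) pairs: discrimination and difficulty (assumed known).
    The index is undefined when all items have the same success probability.
    """
    probs = [p_2pl(theta, a, b) for a, b in items]
    mean_p = sum(probs) / len(probs)
    # A term is positive when an easy item is missed or a hard item is answered correctly.
    f = sum((p - x) * (p - mean_p) for p, x in zip(probs, responses))
    variance = sum(p * (1.0 - p) * (p - mean_p) ** 2 for p in probs)
    return f / math.sqrt(variance)


# Hypothetical items ordered from easy to hard.
items = [(1.0, -1.5), (1.0, 0.0), (1.0, 1.5)]
typical = zeta([1, 1, 0], 0.0, items)   # easy items right, hard item wrong
aberrant = zeta([0, 1, 1], 0.0, items)  # easy item wrong, hard item right
print(typical, aberrant)
```

Under these assumptions the aberrant pattern receives a positive value and the typical pattern a negative one, which matches the qualitative description of unusualness given here.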

Classification of Examinees’ Responses

Examinees’ performances on test items are not always consistent with their unobservable
patterns of attribute mastery. Responses that deviate from an ideal response pattern are assumed to
contain random errors or slips. Under the assumption that occurrences of slips on items are
independent across items, Tatsuoka & Tatsuoka (1987) showed that the distribution of the number
of slips follows a binomial distribution if the slippage probabilities are the same across the items,
and follows a compound binomial distribution if the slippage probabilities differ across items.
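
The two cases can be illustrated with a small sketch that builds the distribution of the number of slips one item at a time, assuming only that slips are independent across items. The slip probabilities shown are hypothetical values chosen for illustration.

```python
def slip_count_distribution(slip_probs):
    """Distribution of the number of slips over independent items.

    slip_probs[j] is the slip probability for item j. Entry s of the result is
    the probability of exactly s slips. Equal probabilities give the ordinary
    binomial; unequal probabilities give the compound binomial.
    """
    dist = [1.0]  # with no items, zero slips is certain
    for p in slip_probs:
        updated = [0.0] * (len(dist) + 1)
        for s, mass in enumerate(dist):
            updated[s] += mass * (1.0 - p)  # no slip on this item
            updated[s + 1] += mass * p      # a slip on this item
        dist = updated
    return dist


equal = slip_count_distribution([0.1, 0.1, 0.1])     # binomial case
unequal = slip_count_distribution([0.05, 0.2, 0.1])  # compound binomial case
print(equal)
print(unequal)
```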

When the non-ideal response patterns associated with a particular ideal pattern, R, are
mapped into the Rule Space (by computing their $\theta$ and $\zeta$ values), they form a unique subset
that swarms around the point ($\theta_R$, $\zeta_R$). The swarm of mapped points in the Rule Space
follows approximately a multivariate normal distribution with a centroid of ($\theta_R$, $\zeta_R$), and is
called the bug distribution or state distribution associated with response pattern R (Tatsuoka,
1990). When all possible ideal item response patterns are mapped on to the Rule Space, one can
apply Bayes’ decision rule for minimum error to classify an examinee’s point ($\theta_x$, $\zeta_x$) into
one of the possible latent states. More detailed discussions of the classification procedure can be
found in Tatsuoka (1990), Tatsuoka & Tatsuoka (1987, 1989), and Sheehan, Tatsuoka, & Lewis
(1991).
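
A minimal sketch of this kind of classification follows, assuming each state distribution is bivariate normal with a known centroid and covariance matrix and that prior probabilities are equal unless supplied. The state names, centroids, and covariances are hypothetical, and the sketch is not the RULESPACE program.

```python
import math


def log_density(point, centroid, cov):
    """Log of the bivariate normal density at point, for a 2x2 covariance matrix."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    dx = point[0] - centroid[0]
    dy = point[1] - centroid[1]
    # Squared Mahalanobis distance using the explicit inverse of the 2x2 matrix.
    m2 = (d * dx * dx - (b + c) * dx * dy + a * dy * dy) / det
    return -0.5 * m2 - 0.5 * math.log(det) - math.log(2.0 * math.pi)


def classify(point, states, priors=None):
    """Return the state with the largest posterior probability for the point.

    states maps a state name to a (centroid, covariance) pair.
    """
    names = list(states)
    if priors is None:
        priors = {name: 1.0 / len(names) for name in names}
    return max(
        names,
        key=lambda name: math.log(priors[name]) + log_density(point, *states[name]),
    )


# Hypothetical state distributions in the (theta, zeta) plane.
states = {
    "state_A": ((-1.0, 0.0), ((1.0, 0.0), (0.0, 1.0))),
    "state_B": ((1.0, 0.5), ((1.0, 0.0), (0.0, 1.0))),
}
print(classify((0.8, 0.4), states))
```

The sketch always returns a state; it does not model the cases in which an examinee is left unclassified.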


Applying  Rule  Space  to  Architecture  Assessment 

The  items  used  in  this  research  were  intended  to  assess  a  wide  range  of  architectural 
knowledge  and  skills  across  several  subdisciplines  of  architecture.  Different  items  required 
different  problem-solving  operations.  For  example,  some  items  required  examinees  to  specify  the 
properties of structural elements while others required the proper arrangement of architectural
elements on the computer. The range of operations used across items implied that defining
attributes in terms of low-level operations would produce an attribute set with little overlap across
items. This would defeat the purpose of the Rule Space. We therefore analyzed the architecture
items at a coarser grain, using attributes descriptive of higher-level processing as suggested by a
general model of problem solving. This approach required a modification to the procedure used in
other Rule Space analyses. We first defined a cognitive model that was general enough to account
for problem-solving behavior on all items. Attribute definitions were then based on the model. In
the next section, we describe the cognitive model and our procedure for defining item attributes.

The  Cognitive  Model 

Our cognitive model was derived in part from a theory of computer interface use (Lewis &
Polson, 1990). This model was chosen because of ostensible similarities between problem solving
in user interface evaluation and solution of figural response items. Our adaptation of Lewis and
Polson’s model was based on verbal protocols from one pilot subject who solved all 22
architecture items.¹ The analysis of protocols from a single subject was not used to produce a
definitive cognitive model, but a hypothesized model which would guide us in developing
reasonable attributes. The reasonableness of this hypothesized model could, in turn, be supported
or refuted by our data.


¹This pilot subject was not part of the test administration discussed in the next section.

The model consists of processes relevant for constructing an initial representation of the
item (i.e., understanding the problem stem and provided diagram), forming goals and performing
actions based on those goals (i.e., solving the item), and determining whether goals have been
satisfied and if they have been satisfied correctly (i.e., checking each problem solving step and the
final answer). The model asserts that these processes exist, but makes no claims as to their order.
For example, an examinee might come to a new understanding of a problem after attempting to
solve it or after checking an initial, incorrect solution. The processes hypothesized by the model
are summarized in Table 1.


Insert  Table  1  about  here 


Understand. The first step in solving any item is to understand what is being asked so that
the appropriate knowledge can be invoked. Each figural response item consisted of both a verbal
stem and a diagram, the latter of which may contain both graphical and verbal information. Thus,
understand processes include: (a) reading and interpreting the verbal stem, (b) scanning and
interpreting the diagram, and (c) relating the information in the stem and diagram to one’s own
knowledge. This processing allows the examinee to form initial goals, and either a plan for
solving the item or a set of heuristics. An initial goal might be to apply a strategy learned in the
classroom or to invoke a general problem-solving method such as means-ends analysis, in which
one chooses at each step an action that will reduce the difference between the current state of the
problem and the desired goal state. In specifying the understand processes (read stem, scan
diagram, and relate to one’s own knowledge), no claims are made as to either the ordering of the
processes or the conditions under which they occur. Particular items will be less or more difficult
in terms of, say, reading and interpreting the stem, and it is just these sorts of differences which
form the basis for the item attribute definitions.

Solve. Once an initial representation of the problem has been built, and the initial goals
formed, the examinee must perform the actions that lead to solving the problem. Of course, while
solving a problem, an examinee may reformulate or refine an initial representation of an item. The
processes involved in solving an item are applied to each goal that has not yet been satisfied. Each
of these goals may be elaborated by forming subgoals of the currently active goal or the examinee
may perform an action that will satisfy the current goal. An action may be physical, such as
drawing a line, or cognitive, such as finding a level area on a contour map. These two processes,
elaboration of goals and performance of actions, do not determine precisely how a particular item is
solved. Certain questions are left open. For example, which subgoals are formed when a
particular goal is elaborated? How does the examinee decide on which actions to perform to satisfy
a goal? Answering these questions requires a knowledge of the particular strategies used to solve
each item. Whatever strategy an examinee uses (whether problem-specific or general), that
strategy will determine which goals are attended to and in what order, and what subgoals are
formed.

Check. Once an action has been performed, the results of that action may be evaluated to
ensure that the action was performed correctly and that it satisfies the original goal. If both
conditions are met, the examinee may mark that goal as finished (perhaps by saying something to
the effect of “Okay, that’s done”), and proceed to the next unsatisfied goal. Thus, two types of
evaluations may occur: monitoring whether an action has been carried out as planned and noting
whether it satisfies the original goal.

Attribute Creation

Because the figural response items were designed to assess a wide range of architectural
knowledge and skill, defining attributes in terms of the actual steps candidates take in solving the
items (the approach used in previous applications of Rule Space) was contra-indicated. Instead,
we defined attributes in terms of item characteristics or features. Each item has multiple features
and could be classified along several dimensions, but for purposes of attribute creation we
identified those features with a potential causal connection to examinee performance. The attributes
were defined by identifying features of the items that could be expected either to help or hinder
problem-solving. For example, we hypothesized that problem solving would be hindered during
the process “scan the provided diagram,” if the diagram was a specialized graph (e.g., a
topographic map) that would not be understood by all examinees. The 38 attributes identified in
the task analysis are listed in Table 2. To illustrate the assignment of attributes to items, Table 3
shows the attributes associated with the “library” item of Figure 1 along with an explanation of
why that attribute was assigned.

Each attribute is associated with one or more of the three types of processing (understand,
solve, and check), and those assignments are shown in Table 4. The assignment of attributes to
process was made by two independent judges with an inter-rater agreement of 88%.
Disagreements were settled through discussion between the judges. Two independent judges also
determined the subset of elementary cognitive attributes needed to solve each question. The
inter-rater reliability for this process was again 88%. As before, disagreements were settled through
discussion between the judges.


Insert  Tables  2, 3,  and  4  Here 


Method 


Materials  and  Design 

Twenty-two figural response questions were constructed to draw upon skills needed
throughout the broad content of an architectural licensing examination. These questions were
developed for presentation on a computer with responses made through mouse movements and
clicks. The questions were divided into two eleven-item subsets, and each subset was
administered to a random half of the available subjects.²

Subjects

Subjects (N=122) were selected from three status groups: practicing architects (N=34),
architecture interns (N=35), and architecture students (N=53). The eleven item responses
provided by each subject were scored correct/incorrect and modeled with a two-parameter logistic
IRT model. Maximum likelihood estimates of proficiency ($\theta$) were subsequently obtained for
each subject. These estimates were used to classify subjects into three equal-sized proficiency
groups. The cross-tabulation of status groups and proficiency groups is shown in Table 5.
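
As a sketch of the estimation step, the code below computes a maximum likelihood estimate of proficiency under a two-parameter logistic model by bisection on the derivative of the log-likelihood. It assumes the item parameters are already known; the parameter values, search bounds, and function names are hypothetical, and the sketch does not reproduce the software used in the study.

```python
import math


def p_2pl(theta, a, b):
    """Two-parameter logistic probability of a correct response."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def mle_theta(responses, items, lo=-4.0, hi=4.0, iterations=60):
    """Maximum likelihood proficiency estimate for one 0/1 response vector.

    items is a list of (a, b) pairs: discrimination and difficulty (assumed known).
    The score function sum_j a_j * (x_j - P_j(theta)) decreases in theta, so
    bisection finds its root. All-correct or all-incorrect vectors have no
    finite estimate and end up at a search bound.
    """
    def score(theta):
        return sum(a * (x - p_2pl(theta, a, b)) for x, (a, b) in zip(responses, items))

    for _ in range(iterations):
        mid = (lo + hi) / 2.0
        if score(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


# Hypothetical item parameters for a three-item illustration.
items = [(1.0, -1.0), (1.2, 0.0), (0.8, 1.0)]
print(mle_theta([1, 1, 0], items), mle_theta([1, 0, 0], items))
```

Sorting the resulting estimates and cutting them at the tertiles would give three equal-sized proficiency groups of the kind described here.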


²Subjects solved only eleven of the figural response items because they were also administered a set of
complementary multiple-choice items. Time constraints did not permit additional testing. Contrasts between item
sets are reported in another study (Martinez & Katz, 1992).


Insert Table 5 about here


Procedure 

In groups of six, subjects were given a verbal introduction to the item delivery system.
Following that, they each attempted the items individually on a computer. Of the 122 subjects,
three subjects generated verbal protocols to gather independent support for the cognitive model.
To generate the protocols, the subjects were asked to “think aloud” (Ericsson & Simon, 1984),
saying anything that they would normally “say” to themselves as they solved the items.

Rule  Space  Analyses 

Rule Space analyses were conducted separately for each of the three groups of
problem-solving attributes identified above. This strategy was chosen for two reasons. One very
practical reason is that the combination of attributes made the possible number of knowledge states
astronomical for the entire set of 38 attributes; thus the total pool of attributes had to be
subdivided. A second reason was to contrast attribute clusters in their ability to classify examinees.
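
As a rough illustration of the first reason, and assuming for the sake of an upper bound that each of the 38 attributes could be mastered or not mastered independently of the others, the number of candidate mastery patterns would be:

```latex
\[
2^{38} = 274{,}877{,}906{,}944 \approx 2.7 \times 10^{11}
\]
```

Attribute nesting and the item-attribute structure reduce the number of distinguishable states, but the bound indicates why a joint analysis of all 38 attributes was impractical.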

Rule Space was carried out in two steps: First, the BUGLIB computer program (Varandi
& Tatsuoka, 1990) was used to determine the set of all possible latent knowledge states associated
with the specified stage; second, the RULESPACE computer program (Tatsuoka, Bailie &
Sheehan, 1990) was used to classify subjects into one of the knowledge states. Three attempts
were made to classify each examinee, one for each of the three problem-solving process types
(understand, solve, and check).


Results 


Verbal  Protocol  Results 

Our cognitive model postulated that certain processes would be used as a subject solved the
architecture items. One way to gather evidence for the model is to show that these processes are
sufficient for explaining the verbalizations made by subjects (Ericsson & Simon, 1984). Eight
categories of subject verbalizations were defined, one category for each process in the cognitive
model and a “miscellaneous” category. These categories were defined through examining
verbalizations of the pilot subject as she solved eleven of the items. The sufficiency of the
categories was established by attempting to categorize the verbalizations on the remaining eleven
items. One rater categorized all of the subject’s verbalizations, while another rater independently
categorized a portion of the verbalizations. The inter-rater agreement on the portion scored by both
raters was 82%. The final categories are shown in Table 6. The verbalizations encoded as
miscellaneous include single words or short phrases (“Okay,” “Let’s see”), statements concerning
the computer interface (“I have to click twice”), and statements irrelevant to the task.


Insert  Table  6  about  here 


The categorization scheme was applied to the verbal reports of the three protocol subjects.
The cognitive model accounted for 71% of the verbalizations made by subjects; the remaining
verbalizations fell into the miscellaneous category. This result suggests that the model adequately
captured subjects’ problem-solving performance, and thus supports the validity of the cognitive
attributes created from this model.

Rule  Space  Results 

The projection of examinee response data into the two-dimensional Rule Space is presented
in Figure 2. Examinees’ $\theta$ values are plotted along the x-axis; $\zeta$ values are plotted along the
y-axis. The symbols indicate status group membership. The plot shows that practicing architects are
located mostly in the medium to high proficiency region and form a cluster that is distinct from the
points plotted for interns and students.


Insert  Figure  2  about  here 


Each examinee’s performance was diagnosed three times, once for each of the understand,
solve, and check attributes. For each diagnosis, the examinee’s point in the rule space was
compared to the points corresponding to the set of knowledge states associated with each attribute
group. The item/attribute incidence matrices developed for each problem-solving process type
determined the number of possible states: 803 for understand, 1208 for solve, and 121 for check.
Within each process type, each knowledge state corresponded to a unique combination of mastered
attributes and is represented by a unique point in the Rule Space.

The classification results for each of the three types of problem-solving processes are
presented in Table 7. Within each process type, the number and percentage of classified examinees
is broken down by IRT-proficiency level (low, medium, and high) and status group (student,
intern, architect). Two patterns are worth noting. The first is that the solve attributes are the most
powerful in classifying subjects across proficiency levels and status groups; in fact, all 41
low-proficiency examinees were classified. The next most powerful set of attributes is understand,
followed by check. A second pattern is that, almost uniformly, examinees in the lower proficiency
or status groups were more often classified than those in the higher groups. For example, the
percentage of low-proficiency examinees classified under check (61%) was twice that of
high-proficiency examinees (30%).


Insert  Table  7  about  here 


The low classification rate achieved for the check processes is considered in Figure 3. In
this plot, the diamonds stand for latent knowledge states and the boxes indicate the examinees’
diagnostic location. The plot shows that the 121 knowledge states deduced from the check
incidence matrix do not coincide with the examinees’ points. Thus, the attributes defined from the
check portion of the model do not capture examinee behavior, suggesting that examinee
performance is not greatly differentiated by check processes (or that we need to rework that portion
of the model).


Insert Figure 3 about here


Attribute Mastery Probabilities

An attribute mastery vector was estimated for each classified examinee. These vectors are
composed of zeros and ones, depending on whether the attribute in question was included in the
subset of mastered attributes defined for the examinee's state. Attribute mastery patterns were
averaged within proficiency and status groups, and analyzed using a repeated measures analysis of
variance design, as described in Sheehan, Tatsuoka, and Lewis (1991).³
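The averaging step can be sketched as follows. Averaging 0/1 mastery vectors within a group
gives, for each attribute, the proportion of that group's classified examinees who mastered it.
The records and group labels below are made-up illustrations, not data from the study, and the
function name is hypothetical.

```python
def mean_mastery(records):
    """Average 0/1 mastery vectors within each group.

    records: iterable of (group_label, mastery_vector) pairs.
    Returns a dict mapping each group to its per-attribute mean mastery.
    """
    totals, counts = {}, {}
    for group, vector in records:
        if group not in totals:
            totals[group] = [0] * len(vector)
            counts[group] = 0
        totals[group] = [t + v for t, v in zip(totals[group], vector)]
        counts[group] += 1
    return {g: [t / counts[g] for t in totals[g]] for g in totals}


# Hypothetical classified examinees with three attributes each.
records = [
    ("low", [1, 0, 1]),
    ("low", [0, 0, 1]),
    ("high", [1, 1, 1]),
]

profiles = mean_mastery(records)
print(profiles)
```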

P-values for the analysis of variance F-tests are reported in Table 8. The table provides
evidence for three clearly significant effects: proficiency group, attribute, and the attribute x
proficiency group interaction. These results are reassuring because they indicate that the attributes
associated with each problem-solving stage are differentially difficult and that examinees in
different proficiency groups tend to have different attribute mastery profiles.

The results obtained for the status group classification are not as clear-cut. Although the
main effect of status group is clearly not significant, the interaction of status group with attribute is
marginally significant. This indicates that the average probability of mastery values calculated for
some attributes differed among students, interns, and practicing architects, but these differences
did not hold up after averaging over all attributes. Thus, on the average, examinees in different
status groups did not differ in their mastery of the elementary cognitive skills identified in this
study.


Insert  Table  8  about  here 


Table 9 presents the mean probability of mastery values estimated for the solve attributes.
The different attribute mastery profiles obtained for low, medium, and high proficiency examinees
are clearly indicated. The table also shows that attributes differ in discrimination. For example,
consider the probabilities listed for the “environment” attribute: On average, low proficiency
examinees mastered this attribute with a probability of .47; the corresponding probabilities for
medium and high proficiency examinees are .60 and .97, respectively. The varying probabilities
obtained for low, medium, and high proficiency examinees indicate that this attribute is highly
discriminating. By contrast, the three mean values listed for the “learned procedure” attribute are
all very similar. Thus, this attribute is not particularly helpful at discriminating among examinees
of different ability levels.
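One simple way to make this comparison concrete is to take the gap between the high- and
low-proficiency means for each attribute. This gap is an illustrative index assumed here, not a
statistic reported in the paper. The values are copied from Table 9, and the table's "Learned
Algorithm" row is assumed to be the attribute the text calls "learned procedure".

```python
# Mean mastery probabilities from Table 9: attribute -> (low, medium, high).
table9 = {
    "Environment": (0.47, 0.60, 0.97),
    "Move/Rotate": (0.31, 0.39, 0.61),
    "Learned Algorithm": (0.92, 0.97, 0.98),
    "Many Correct": (0.35, 0.28, 0.35),
}


def high_low_gap(means):
    """Difference between high- and low-proficiency means (assumed index)."""
    low, _medium, high = means
    return round(high - low, 2)


gaps = {name: high_low_gap(means) for name, means in table9.items()}
ranked = sorted(gaps.items(), key=lambda pair: pair[1], reverse=True)

for name, gap in ranked:
    print(f"{name:20s} {gap:.2f}")
```

On this index, "Environment" separates the groups most sharply among the four rows shown, while
"Learned Algorithm" barely separates them, which matches the reading given in the text.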


Insert  Table  9  about  here 


Discussion  and  Conclusions 

This study exemplifies how an IRT-based model for estimation of overall proficiency can
be combined with the diagnostic classification of examinees. The results of the application of Rule
Space were satisfying: We were able to classify a large proportion of examinees, especially those
of low and medium ability. In principle, these classifications could be reported back to examinees


³ A standard analysis of variance design would not have been appropriate for these data because the hypothesis of
multisample sphericity (that is, independently observed attributes) is violated. The violation results from the fact
that, instead of measuring a single attribute on each examinee, our design involves taking 38 attribute
measurements. Thus, non-zero correlations are expected among the attribute measurements associated with a
particular examinee.
so that remediation in weak areas could proceed. Traditional psychometrics has served well in
discriminating among examinees for selection, placement, or classification on the basis of global
estimates of proficiency. Rule Space provides estimates of θ, but also yields information that
could serve the interests of the examinee in pinpointing areas of non-mastery. Of course,
applications of the technique described in this paper to other complex domains may require a much
larger sample size than was used in the current study. Data from a relatively small number of
examinees were sufficient for the goal of this paper, which was to demonstrate and explain a
methodology for extending Rule Space.

In addition to diagnosis and estimation of θ, Rule Space provides a framework for
comparing a model of task performance to examinees' response data. There are few well-defined
methodologies for comparing models to data (but see Polk & Newell, 1991), especially those that
can accommodate a great variety of individual differences in examinees' knowledge, skill, and
strategy. Model testing proceeds as follows: On the basis of a cognitive model, items are analyzed
into their component cognitive attributes. The resulting item/attribute matrix (or matrices) leads to
strong predictions about examinees' response patterns. If the (θ, ζ) position of an examinee's
response pattern is close to that of an ideal response pattern, that examinee is classified into the
knowledge state that the response pattern implies. To the extent that examinees' response patterns
can be classified, the analysis provides support for the cognitive model. There are of course
limitations to the Rule Space method. We have already noted that sets of attributes processed
together are limited in size. As they approach 25 or so, the combinations of attribute profiles
make the possible number of ideal states unmanageable. Consequently, the attributes must be
clustered and run separately, as in this study.
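The classification step and the size limitation can both be sketched in a few lines. The code
below is a deliberately simplified stand-in: it assigns an examinee to the nearest ideal point in
the two-dimensional rule space using plain Euclidean distance and a fixed cutoff, whereas the
operational method relies on a statistical distance and decision rule. The state names,
coordinates, and cutoff are hypothetical.

```python
import math


def classify(point, ideal_points, cutoff):
    """Return the name of the nearest ideal point, or None if none is within cutoff.

    point: (theta, zeta) coordinates of an examinee's response pattern.
    ideal_points: dict mapping knowledge-state names to (theta, zeta) coordinates.
    """
    best_name, best_dist = None, math.inf
    for name, ideal in ideal_points.items():
        dist = math.dist(point, ideal)  # Euclidean distance (simplifying assumption)
        if dist < best_dist:
            best_name, best_dist = name, dist
    return best_name if best_dist <= cutoff else None


# Hypothetical ideal points for two knowledge states.
ideal_points = {"state_A": (-1.0, 0.5), "state_B": (0.8, -0.2)}

near = classify((0.7, 0.0), ideal_points, cutoff=0.5)
far = classify((3.0, 3.0), ideal_points, cutoff=0.5)
print(near, far)

# Upper bound on attribute-mastery combinations for 25 attributes.
max_combinations = 2 ** 25
print(max_combinations)
```

An examinee whose point lies far from every ideal point is left unclassified, and with 25
attributes the upper bound on mastery combinations already exceeds 33 million, which is why the
attributes were clustered and analyzed in separate runs.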

One contribution of this work is that we have outlined a methodology for applying Rule
Space to complex domains. Generally, a limitation of Rule Space is that at the level of fine-grained
analysis, the operations needed to solve items in a complex domain may not overlap a great deal.
Many attributes might in fact be unique to particular items within the item set. If this is the case,
the cognitive attributes must be cast at a higher level of generality such as item characteristics (e.g.,
type of diagram presented) or general problem-solving approach needed to solve each item (e.g.,
recalling a fact versus applying a learned procedure). Given more general attributes, what can we
say about an examinee's performance? From a psychological viewpoint, the attributes tell us little
about the examinee's cognitive competence. But from an educational standpoint, the attributes
provide examinees with just the information they need to improve their performance on subsequent
tests. The attributes allow us to say that an examinee has difficulties with items having certain
properties. While we may have little information about the examinee's skill at a fine-grained level,
the diagnostic report (which attributes are mastered and which aren't) does tell the examinee what
types of problems they should seek out and practice solving, and what components of problem
solving need special attention.

Attributes should be based on an independently constructed problem-solving model.
Analysis of verbal protocols, performed in this work, serves as one means for constructing and
verifying a cognitive model. The model supports attribute creation by showing which aspects of
the items would help or hinder problem-solving performance. In contrast to developing a list of
attributes intuitively, a cognitive model provides a rich description of each attribute because the
meaning of each attribute is derived from its place in the model. Methodologically, this rich
attribute description promotes a fuller understanding of what each attribute means and facilitates the
assigning of attributes to items.

Another contribution of this work is that we were able to examine the power of attributes to
discriminate among examinees of various levels. Knowing which attributes are highly
discriminating has value for the construction of items as well as for the design and sequencing of
instruction. Differential relevance of attributes across proficiency groups also sheds light on the
nature of expert-novice differences in the domain of interest. Rule Space holds a great deal of
value for satisfying the requirements of traditional psychometrics and for diagnosis of individual
examinees. Through the use of such models, psychometrics has much to offer to learners and
teachers beyond estimates of global proficiency.
Table 1

Problem Solving Model


Attribute Group              Processes

Understanding the Item       Read the item stem
                             Scan the diagram
                             Recall relevant information

Solving the Item             Set subgoals
                             Perform actions

Checking Performance         Is the action correct?
                             Is the current goal completed?
Table 2

Attribute Definitions


Characteristics of Presented Figure
  Relations among attributes in the class: The three attributes in this class are mutually
    exclusive (if an item has one attribute in this class, by definition it does not have
    another attribute from the same class) and exhaustive (all of the items may be classified
    as having at least one of the attributes in this class).
  Picture: Presented figure is a sketch of an actual object.
  Diagram: Presented figure is an abstract diagram of an object.
  Specialized Diagram: Presented figure is a graph or chart, a visual representation of some
    information.

Clarity of General Task
  Relations among attributes in the class: Mutually exclusive, but not exhaustive.
  Diagram Obvious: Based on just the presented figure, it is possible for someone to understand
    what task the item is asking them to perform. Details regarding the task included in the
    item stem might still be needed for correct performance of the task.
  Own Obvious: Based on the presented figure along with some prior knowledge, it is possible for
    someone to understand what task the item is asking them to perform. Details regarding the
    task included in the item stem might still be needed for correct performance of the task.
Problem-solving Requirements of Item
  Relations among attributes in the class: Mutually exclusive and exhaustive.
  Declarative: Requires knowing particular architectural symbols and definitions for correct
    solution.
  Learned Procedure: Requires the application of fairly standard, algorithmic procedures that
    usually would have been learned previously.
  Discovered Strategy: Requires the application of knowledge or procedures in a novel way. These
    items are more puzzle-like.

Content Area
  Relations among attributes in the class: Mutually exclusive and exhaustive.
  Definition (shared by all attributes in the class): The item tests knowledge or skills
    associated with one of the recognized subdisciplines of architecture listed below.
  Site Design
  Structural Technology (General)
  Structural Technology (Lateral Forces)
  Materials and Methods
  Construction Documents
Particular Architectural Features
  Relations among attributes in the class: Neither mutually exclusive nor exhaustive.
  Identify Street: Correct problem solving requires that the candidate can recognize a street on
    a site plan.
  Environment: Correct problem solving requires that the candidate knows about constraints due
    to environmental factors (e.g., weather, earthquakes).
  Contour Lines: Requires the ability to read and interpret contour lines.
  Forces: Requires the ability to recognize, interpret, and use force vectors.

General Problem-solving Approach
  Relations among attributes in the class: Mutually exclusive, but not exhaustive.
  Read and Translate: Problem solving goes through cycles of getting information from the
    problem stem, using that information to generate part of the answer, and then repeating.
  Indicate Location of New Feature: Problem solving involves placing given elements into new
    positions or adding information to the provided diagram.

Response Method
  Relations among attributes in the class: Exhaustive, but not mutually exclusive.
  Move/Rotate: Requires arrangement of provided elements.
  Label: Requires selecting which of a provided set of labels should be placed at various
    indicated points on the diagram.
  Draw Line: Requires drawing of lines onto provided diagram.
  Draw Arrow: Requires drawing of arrows onto provided diagram.
Misleading Characteristics
  Relations among attributes in the class: Mutually exclusive, but not exhaustive.
  Stem Incorrect: Without detailed knowledge of an item type, the item's stem suggests an
    incorrect problem-solving method.
  Diagram Incorrect: Without detailed knowledge of an item type or diagram type, the item's
    provided diagram suggests incorrect problem-solving methods.

Relation between Stem and Problem-solving
  Relations among attributes in the class: Stem independent and Stem dependent are mutually
    exclusive and exhaustive. Initial info in stem and Interim info in stem are mutually
    exclusive and exhaustive across Stem dependent items.
  Stem Independent: The item stem provides practically no information that could not be gained
    either through prior knowledge or through the provided figure.
  Stem Dependent: Problem solving is necessarily based on information presented in the item
    stem. This category is the union of “Initial Info” and “Interim Info” categories.
  Initial Information in Stem: While the stem information is necessary for correct solution,
    that information is not directly required during the course of problem solving.
  Interim Information in Stem: The information in the stem is needed a number of times during
    the course of correct problem solving.
Completion Criteria
  Relations among attributes in the class: Mutually exclusive and exhaustive.
  Own Knowledge Stop: Examinees must use their own knowledge to decide whether they are finished
    responding to an item (i.e., if the answer is complete). Neither the stem nor the diagram
    directly supply this information.
  Diagram Stop: The provided diagram indicates whether an answer is complete.
  Diagram and Own Knowledge Stop: The provided diagram along with some specialized knowledge
    indicates whether an answer is complete.
  Stem Stop: Information provided in the stem indicates whether a given answer is complete.

Number of Correct Responses
  Relations among attributes in the class: Mutually exclusive and exhaustive.
  One Correct: The item has only one correct answer.
  Few Correct: The item has two or three correct answers, which are variants of one another.
  Many Correct: The item has several correct answers, some of which may be qualitatively
    different from others and some of which may be variants on another answer.
Table 3

Attributes Associated with “Library” Item (Figure 1)


Attribute: Explanation

Specialized Diagram: The provided figure is a site plan, which is an abstract diagram of the
  actual building site. The site plan diagram contains elements that require specialized
  knowledge to interpret (e.g., contour lines, property lines, symbols for trees).

Diagram Obvious: Based on the provided elements and the operations available (move, etc.), it is
  clear that the general procedure for this task is to place the elements somewhere onto the
  site.

Discovered Strategy: There is no clear, algorithmic procedure for placing the buildings onto the
  site. The examinees must bring to bear knowledge learned in different situations to the
  solving of this task.

Site Design: This item presents a prototypical site design task.

Identify Street: Recognizing the street on the site plan is important for correct placement of
  the parking lot.

Contour Lines: Correctly interpreting the site plan's contour lines is necessary for correct
  placement of the buildings on the site (e.g., the buildings should not be placed on the steep
  slope, but on relatively level ground).

Stem Independent: Beyond the general task and the standard “preserve all trees,” the stem does
  not provide any information that is vital to the correct solution of the item.

Many Correct: There are a number of correct solutions to this item, reflecting different
  arrangements of the buildings on the site.

Move/Rotate: The primary interface operation in this task is moving elements and rotating them
  to fit better onto the site.

Own Stop: Based on their own knowledge, it is up to the examinees to determine when they are
  finished responding to the item. Nothing in the stem or in the diagram provides feedback
  either on the correctness or completeness of a response.
Table  4

Attribute  Assignments  to  Processing  Types 


Attribute  Class 

Attribute 

Problem-Solving  Process  Type 

Understand 

Solve 

Check 

Characteristics  of 

Picture 

"it 

presented  figure

Diagram 

X 

X 

Diagram  Obvious 

X 

X 

X 

Problem-solving 

Learned  Algorithm 

X 

requirements  of  item 

Declarative 

X 

X 

Discovered  Strategy 

X 

X 

Content  area 

Site  Design 

k 

Structural  Technology

X 

Structural  Tech.  (Lateral  Forces)

X 

Materials  and  Methods 

X 

Construction  Documents 

X 

Particular  architectural 

Identify  Street 

X 

features 

Environment 

X 

X 

Contour  Lines 

X 

X 

Forces 

X 

X 

Relation  between  stem 

Stem  Independent 

X 

and  problem-solving 

Stem  Dependent 

X 

Initial  Info  in  Stem 

X 

Interim  Info,  in  Stem 

X 

Number  of  correct

X 

responses 

Few  Correct 

X 

X 

_ Many  Correct _ X 

General  problem-solving  Read  and  Translate  X 

approach 

_ Indicate  Location  of  New  Feature  X 

Response  method  Move/Rotate 

Label 
Draw  Line 

_ Draw  Arrow _ 

Completion  Criteria  Own  Stop

Diagram  Stop 
Stem  Stop 

_ Diagram  +  Own  Stop

Misleading  Stem  Incorrect

Characteristics 

_ Diagram  Incorrect _ X 


X 

X 

X 

X 


XX  XXX  XX 
Table 5

Distribution of Status Groups by Proficiency


                               Status Group
Proficiency     N      Student             Intern              Architect
                       N     Column %      N     Column %      N     Column %

Low             41     27       51         10       29          4       12
Medium          41     17       32         12       34         12       35
High            40      9       17         13       37         18       53
Total          122     53      100         35      100         34      100


Table  6 
Protocol Encoding Categories


Understand

Read stem: statements involving the reading of the problem stem. Read statements include
verbatim readings of the stem as well as partial readings of the problem stem.

Scan diagram: statements involving the provided diagram. Diagram statements include verbatim
readings of verbal information as well as verbal descriptions of information in the diagram (e.g.,
“lateral forces coming that way”).

Relate: statements regarding how the problem or parts of the problem relate to the examinee's own
knowledge. Relate statements consist of several types of verbalizations including verbalizations
regarding:

- an expectation or the violation of an expectation (e.g., “Normally there would be more
lines on this window drawing”)

- recognition of the problem (or part of the problem) as of a particular type (e.g., “This is a
site vignette,” “this is a perspective drawing”)

- predictions as to the difficulty of the problem (e.g., “this will take a while”)

- the definitions or ambiguity of sections of the problem (e.g., “is it an awning or a
hopper?”, “most sheathing I know of is...”)

Solve

Goal: stating an intent or future action. Goal statements are often stated in the future tense or in
terms of “should be.”

Perform: statements regarding the performance of an action. Perform statements are usually stated
in the present or “continuing” tense (e.g., “that dips here”). Perform statements relate only to
physical actions such as moving a block on the screen or locating a particular item in the diagram
(for the latter, e.g., “this is a flat area”). It may be difficult to distinguish between goal and perform
statements.


Check

Evaluate-correct: statements regarding the correctness of a performed action or the result of that
action (e.g., the location of a placed object). Evaluate-correct statements should only refer to the
examinee's own actions or answers, not to the problem itself. These statements may either reflect
judging the correctness of an action (e.g., “is that right?”) or reflect the outcomes of a judgment
(e.g., “that isn't what I wanted to do”).

Evaluate-complete: statements suggesting that some action or goal has been completed. As with
evaluate-correct statements, evaluate-complete statements include verbalizations judging if
something has been finished (e.g., “is there anything else to be done?”) as well as verbalizations
concerning the results of such judgments (e.g., “that's it,” “that was easy”).
Table 7

Classification Results for Subjects Grouped by Proficiency and Status


                              Problem-Solving Process Type
Group            N      Understand             Solve                  Check
                        No. Classified   %     No. Classified   %     No. Classified   %

Proficiency
  Low            41          32         78          41        100          25         61
  Medium         41          29         71          40         98          13         32
  High           40          20         50          33         83          12         30

Status
  Student        53          37         70          52         98          25         47
  Intern         35          23         66          32         91          11         31
  Architect      34          21         62          30         88          14         41

Total           122          81         66         114         93          50         41
Table 8

Analysis of Variance Results for Attribute Mastery Data


                                Problem-Solving Process Type
Group                           Understand     Solve        Check
                                p-value        p-value      p-value

Between Subjects
  Proficiency                   .0001          .0001        .0001
  Status                        .5621          .1948        .3433
  Proficiency x Status          .1343          .4707        .7231

Within Subjects (a)
  Attribute                     .0001          .0001        .0001
  Attribute x Proficiency       .0013          .0001        .0130
  Attribute x Status            .0885          .0874        .4287
  Attr. x Prof. x Status        .4743          .0535        .1029

(a) p-values for within-subject effects were calculated using Wilks' Lambda.
Table 9


                              Proficiency
Attribute                Low      Medium    High      Overall Mean

Many Correct             .35      .28       .35       .34
Draw Arrow               .44      .38       .33       .38
Move/Rotate              .31      .39       .61       .44
Label                    .43      .66       .88       .66
Environment              .47      .60       .97       .68
Contour Lines            .61      .78       .83       .74
Forces                   .70      .64       .88       .74
Identify Street          .75      .76       .84       .78
Interim Info             .67      .82       .92       .80
Diagram Obvious          .70      .76       .94       .80
Own Obvious              .81      .83       .86       .83
Few Correct              .66      .91       .98       .85
Discovered Strategy      .79      .89       .98       .89
Ind. Location            .75     1.00      1.00       .92
Read + Translate         .86      .98       .98       .94
Declarative              .87      .97      1.00       .95
Learned Algorithm        .92      .97       .98       .96
Stem Independent         .89      .99      1.00       .96
Stem Dependent           .93     1.00      1.00       .98
Draw Line                .98     1.00      1.00       .99

Overall Mean             .69      .78       .87       .78